Deep image representations for instance search
We address the problem of visual instance search, which consists of retrieving all
the images within a dataset that contain a particular visual example provided to
the system. The traditional approach to processing image content for this task
relied on extracting local low-level information within images that was "manually
engineered" to be invariant to different image conditions. One of the most popular
approaches uses the Bag of Visual Words (BoW) model on the local features to
aggregate the local information into a single representation. Usually, a final
reranking stage is included in the pipeline to refine the search results. Since the
emergence of deep learning as the dominant technique in computer vision in 2012,
much research attention has been focused on deriving image representations from
Convolutional Neural Network (CNN) models for the task of instance search as a
"data-driven" approach to designing image representations. However, one of the main
challenges in the instance search task is the lack of annotated datasets with which
to fit the parameters of CNN models.
This work explores the capabilities of descriptors derived from CNN models
pre-trained for image classification to address the task of instance retrieval. First, we
conduct an investigation of the traditional bag-of-visual-words encoding on local
CNN features to produce a scalable image retrieval framework that generalizes well
across different retrieval domains. Second, we propose to improve the capacity of the
obtained representations by exploring an unsupervised fine-tuning strategy that allows
us to obtain better-performing representations at the price of losing their generality.
Finally, we propose using visual attention models to weight the contribution of the
relevant parts of an image, obtaining a very powerful image representation for
instance retrieval without requiring the construction of a large and suitable
training dataset for fine-tuning CNN architectures.
Simple vs complex temporal recurrences for video saliency prediction
This paper investigates modifying an existing neural network architecture for static saliency prediction using two types of recurrences that integrate information from the temporal domain. The first modification is the addition of a ConvLSTM within the architecture, while the second is a conceptually simple exponential moving average of an internal convolutional state. We use weights pre-trained on the SALICON dataset and fine-tune our model on DHF1K. Our results show that both modifications achieve state-of-the-art results and produce similar saliency maps. Source code is available at https://git.io/fjPiB
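The second, conceptually simpler recurrence can be sketched as an exponential moving average over a per-frame convolutional state. The sketch below is illustrative only: the function name and the smoothing factor `alpha` are assumptions, not the paper's actual layer or setting.

```python
import numpy as np

def ema_state(frame_features, alpha=0.1):
    """Exponential moving average over per-frame convolutional activations.

    frame_features: iterable of arrays of shape (H, W, C), one per video frame.
    alpha: smoothing factor (hypothetical value; the paper's setting may differ).
    Returns one smoothed state per frame.
    """
    state = None
    smoothed = []
    for feat in frame_features:
        # First frame initializes the state; later frames blend into it.
        state = feat if state is None else alpha * feat + (1 - alpha) * state
        smoothed.append(state)
    return smoothed
```

Compared with a ConvLSTM, this recurrence adds no trainable parameters, which is what makes the reported parity in saliency quality notable.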
Informed perspectives on human annotation using neural signals
In this work we explore how neurophysiological correlates related to attention and perception can be used to better understand the image-annotation task. We explore the nature of the highly variable labelling data often seen across annotators. Our results indicate potential issues with regard to 'how well' a person manually annotates images and variability across annotators. We propose that such issues arise in part as a result of subjectively interpretable instructions that may fail to elicit similar labelling behaviours and decision thresholds across participants. We find instances where an individual's annotations differ from a group consensus, even though their EEG (Electroencephalography) signals indicate that they were in fact likely in consensus with the group. We offer a new perspective on how EEG can be incorporated in an annotation task to reveal information not readily captured using manual annotations alone. As crowd-sourcing resources become more readily available for annotation tasks, one can reconsider the quality of such annotations. Furthermore, with the availability of consumer EEG hardware, we speculate that we are approaching a point where it may be feasible to better harness an annotator's time and decisions by examining neural responses as part of the process. In this regard, we examine strategies to deal with inter-annotator sources of noise and correlation that can be used to understand the relationship between annotators at a neural level.
Exploring EEG for Object Detection and Retrieval
This paper explores the potential for using Brain Computer Interfaces (BCI)
as a relevance feedback mechanism in content-based image retrieval. We
investigate if it is possible to capture useful EEG signals to detect if
relevant objects are present in a dataset of realistic and complex images. We
perform several experiments using a rapid serial visual presentation (RSVP) of
images at different rates (5Hz and 10Hz) on 8 users with different degrees of
familiarization with BCI and the dataset. We then use the feedback from the BCI
and mouse-based interfaces to retrieve localized objects in a subset of TRECVid
images. We show that it is indeed possible to detect such objects in complex
images and, moreover, that users with previous knowledge of the dataset or
experience with the RSVP outperform others. When the users have limited time to
annotate the images (100 seconds in our experiments) both interfaces are
comparable in performance. Comparing our best users in a retrieval task, we
found that EEG-based relevance feedback outperforms mouse-based feedback. The
realistic and complex image dataset differentiates our work from previous
studies on EEG for image retrieval.

Comment: This preprint is the full version of a short paper accepted at the
ACM International Conference on Multimedia Retrieval (ICMR) 2015 (Shanghai,
China).
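The relevance-feedback step described above can be sketched as a fusion of baseline retrieval scores with per-image feedback scores. Everything in this sketch is an illustrative assumption, not the paper's method: the function name, the linear fusion rule, and the `weight` coefficient.

```python
def rerank_with_feedback(base_scores, feedback_scores, weight=0.5):
    """Fuse baseline retrieval scores with per-image relevance feedback.

    base_scores: dict image_id -> content-based retrieval score.
    feedback_scores: dict image_id -> relevance score from an interface
        (e.g. an EEG classifier's confidence, or 0/1 mouse annotations).
    weight: fusion coefficient (illustrative; not from the paper).
    Returns image ids ranked by the fused score, best first.
    """
    fused = {
        img: (1 - weight) * base_scores[img]
             + weight * feedback_scores.get(img, 0.0)
        for img in base_scores
    }
    return sorted(fused, key=fused.get, reverse=True)
```

Under this scheme, an image with modest content-based similarity can overtake stronger candidates if the interface signals strong relevance for it.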
Temporal saliency adaptation in egocentric videos
This work adapts a deep neural model for image saliency
prediction to the temporal domain of egocentric video. We compute the
saliency map for each video frame, first with an off-the-shelf model
trained on static images, and second by adding a convolutional or
conv-LSTM layer trained with a dataset for video saliency prediction.
We study each configuration on EgoMon, a new dataset made of seven
egocentric videos recorded by three subjects in both free-viewing and
task-driven setups. Our results indicate that the temporal adaptation is
beneficial when the viewer is not moving and observes the scene from
a narrow field of view. Encouraged by this observation, we compute and
publish the saliency maps for the EPIC Kitchens dataset, in which viewers
are cooking.
Links between technology and philosophy: the specific case of the metaverse ecosystem
For years, the development of technology and artificial intelligence has raised ethical and anthropological questions. The Summa Theologica of Saint Thomas Aquinas makes it possible to categorize technological terms and establish ontologies in a philosophical key. Using the methodology of the Summa, this article studies the term "metaverse" and its ethical implications. The article also provides a systematic review of the scientific literature to build a knowledge map, from both a qualitative and a quantitative point of view, that makes visible the impact of this emerging terminology in research.
Bags of local convolutional features for scalable instance search
This work proposes a simple instance retrieval pipeline based on encoding the convolutional features of a CNN using the bag-of-words (BoW) aggregation scheme. Assigning each local array of activations in a convolutional layer to a visual word produces an assignment map, a compact representation that relates regions of an image with a visual word. We use the assignment map for fast spatial reranking, obtaining object localizations that are used for query expansion. We demonstrate the suitability of the BoW representation based on local CNN features for instance retrieval, achieving competitive performance on the Oxford and Paris buildings benchmarks. We show that our proposed system for CNN feature aggregation with BoW outperforms state-of-the-art techniques using sum pooling on a subset of the challenging TRECVid INS benchmark.
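The assignment-map idea can be sketched as follows, assuming a visual-word codebook already learned (e.g. with k-means) over local convolutional descriptors; the function name and the brute-force nearest-centroid search are illustrative choices, not the paper's implementation.

```python
import numpy as np

def bow_assignment_map(conv_features, centroids):
    """Assign each local convolutional activation to its nearest visual word.

    conv_features: (H, W, C) activations from a convolutional layer, so each
        spatial position holds a C-dimensional local descriptor.
    centroids: (K, C) visual-word codebook.
    Returns the (H, W) assignment map and an L1-normalized K-dim BoW histogram.
    """
    h, w, c = conv_features.shape
    flat = conv_features.reshape(-1, c)                               # (H*W, C)
    dists = np.linalg.norm(flat[:, None] - centroids[None], axis=2)  # (H*W, K)
    assignment = dists.argmin(axis=1).reshape(h, w)                  # assignment map
    hist = np.bincount(assignment.ravel(),
                       minlength=len(centroids)).astype(float)
    hist /= hist.sum()                                               # BoW vector
    return assignment, hist
```

Because the assignment map preserves the spatial layout of the visual words, regions matching a query's words can be located directly on it, which is what enables the fast spatial reranking described above.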